Finding ID | Version | Rule ID | IA Controls | Severity |
---|---|---|---|---|
V-257523 | CNTR-OS-000210 | SV-257523r960915_rule | Medium |
Description |
---|
It is critical that when the container platform is at risk of failing to process audit logs as required that it takes action to mitigate the failure. Audit processing failures include software/hardware errors, failures in the audit capturing mechanisms, and audit storage capacity being reached or exceeded. Responses to audit failure depend upon the nature of the failure mode. Because availability of the services provided by the container platform, approved actions in response to an audit failure are as follows: (i) If the failure was caused by the lack of audit record storage capacity, the container platform must continue generating audit records if possible (automatically restarting the audit service if necessary), overwriting the oldest audit records in a first-in-first-out manner. (ii) If audit records are sent to a centralized collection server and communication with this server is lost or the server fails, the container platform must queue audit records locally until communication is restored or until the audit records are retrieved manually. Upon restoration of the connection to the centralized collection server, action must be taken to synchronize the local audit data with the collection server. |
STIG | Date |
---|---|
Red Hat OpenShift Container Platform 4.12 Security Technical Implementation Guide | 2024-06-10 |
Check Text ( C-61258r921510_chk ) |
---|
Verify there is a Prometheus rule to watch for audit events by executing the following: oc get prometheusrule -o yaml --all-namespaces | grep apiserver_audit Output: sum by (apiserver,instance)(rate(apiserver_audit_error_total{apiserver=~".+-apiserver"}[5m])) / sum by (apiserver,instance) (rate(apiserver_audit_event_total{apiserver=~".+-apiserver"}[5m])) > 0 If the output above is not displayed, this is a finding. |
Fix Text (F-61182r921511_fix) |
---|
Apply the following Prometheus rule by executing the following: oc apply -f - << 'EOF' --- # platform = multi_platform_ocp apiVersion: monitoring.coreos.com/v1 kind: PrometheusRule metadata: name: audit-errors namespace: openshift-kube-apiserver spec: groups: - name: apiserver-audit rules: - alert: AuditLogError annotations: summary: |- An API Server instance was unable to write audit logs. This could be triggered by the node running out of space, or a malicious actor tampering with the audit logs. description: An API Server had an error writing to an audit log. expr: | sum by (apiserver,instance)(rate(apiserver_audit_error_total{apiserver=~".+-apiserver"}[5m])) / sum by (apiserver,instance) (rate(apiserver_audit_event_total{apiserver=~".+-apiserver"}[5m])) > 0 for: 1m labels: severity: warning EOF |